ABSTRACT

In this paper, we propose a probabilistic generative model,
called the unified model, which naturally unifies the ideas of
social influence, collaborative filtering and content-based
methods for item recommendation. To address the issue of hidden
social influence, we devise new algorithms to learn the model 
parameters of our proposal based on expectation maximization
(EM). In addition to a single-machine version of our EM
algorithm, we further devise a parallelized implementation 
on the Map-Reduce framework to process two large-scale 
datasets we collect. Moreover, we show that the social in- 
fluence obtained from our generative models can be used 
for group recommendation. Finally, we conduct compre- 
hensive experiments using the datasets crawled from last.fm 
and whrrl.com to validate our ideas. Experimental results 
show that the generative models with social influence signif- 
icantly outperform those without incorporating social influ- 
ence. The unified generative model proposed in this paper 
obtains the best performance. Moreover, our study on social 
influence finds that users in whrrl.com are more likely to get 
influenced by friends than those in last.fm. The experimen- 
tal results also confirm that our social influence based group 
recommendation algorithm outperforms the state-of-the-art 
algorithms for group recommendation. 

Categories and Subject Descriptors 

H.3.3 [Information Search and Retrieval]: Information 
Filtering; J.4 [Computer Applications]: Social and
Behavioral Sciences

General Terms 

Algorithms, Experimentation. 

Keywords 

Recommender Systems, Probabilistic Generative Model, So- 
cial Networks, Social Influence, Group Recommendation. 
1. INTRODUCTION

As an indispensable type of information filtering tech- 
niques, recommendation systems have attracted a lot of at- 
tention in the past decade and have been successfully de- 
ployed in many e-commerce websites, such as Amazon and 
Netflix. Collaborative filtering (CF) and content-based tech- 
niques are two widely adopted approaches for recommenda- 
tion systems Q]. Collaborative filtering [HI [HI EH EI] rec- 
ommends items for a given user by referencing item ratings 
from other similar users, while content-based techniques [25]
make recommendations by matching a user's personal inter- 
ests (or profiles) with item content (e.g., item description or 
tags). Some research works have also discussed approaches 
that integrate both techniques for item recommendation [28,
36]. However, no emphasis has been placed explicitly on
users' social influence in these works. In our real life, we 
usually turn to our friends for recommendations of books, 
movies or restaurants. As evidenced by the dramatic expansion
of social media and social networking systems, social
influence from friends presents new opportunities for
recommendation systems but also brings many great challenges.
In this paper, we aim to take social influence among users, 
along with user profile, user preference and item content, 
into the design of recommendation systems. 

To meet users' social demands, [15, 18, 19, 20, 21] show
that social influence is beneficial for item recommendation.
The underlying idea is that a user's friends may share common
interests with the user and influence the user's decisions.
To incorporate social influence into the recommendation
system, [15, 18] employ the random walk approach [31] over
the user's social network for item recommendation. On the
other hand, model-based systems have also been extended to
include social influence [19, 20, 21]. Assuming that trust
intensities between a user and his friends are available,
some prior works integrate users' social trust network into
their models through a linear combination [19, 20] or as a
regularization term [21]. However, most of the proposed
methods either apply ad hoc heuristics to include social
influence or assume that quantified prior knowledge of social
trust is readily available. There is a need to define
comprehensible social influence, beyond random walks over the
social network, and to explicitly model and unveil the social
influence from the data available to the recommendation system.

Owing to the success of collaborative filtering and
content-based recommendation ideas, in this paper we propose
to incorporate social influence with these ideas in a unified
fashion to design new recommendation systems. Through our
design, we aim to demonstrate the importance and strength
of social influence to recommendation services. To the best
of our knowledge, ideas for unifying social influence with
collaborative filtering and content-based recommendation are
unexplored and very challenging. In this paper, we adopt the
probabilistic generative model as a methodology to reach our
goal. The basic idea behind probabilistic generative models
is to "mimic" user behaviors in a process of decision making,
e.g., deciding at which restaurant to dine. While there
exists a prior study [28] on integrating collaborative
filtering and item content into a probabilistic generative
model for item recommendation, we want to point out that
incorporating social influence into the probabilistic
generative model is nontrivial. Notice that the data used for
CF and content-based techniques (i.e., the user-item
accessing history, user profiles and item content) contain
explicit observations. Thus, the notions of user preferences,
user profiles and item content can be easily modeled. Social
influence, on the other hand, cannot be observed directly
from the data (i.e., we only know who accesses which item but
never know whether this decision is influenced by other
people). We therefore introduce a latent variable and develop
algorithms to capture the social influence between friends,
in addition to the latent variable for users' topics.

The proposed probabilistic generative model is a latent
class statistical mixture model. The model discovers (1)
users' personal preference distribution over latent topics¹;
(2) an item generative distribution for each topic; and (3) a
social influence distribution from friends for each user. The
generative model aims to capture the process of human
behaviors and/or reasoning for decision making. For example,
a user (u) wants to choose a restaurant (i) for dinner. He
may choose one based on his own tastes or turn to one of
his friends (f) for help. In the case that u wants to choose
the restaurant without any influence from his friends (with
a certain probability), he chooses a topic according to his
personal preference distribution. Then the selected topic in
turn "generates" an item i following the topic's item
generative distribution. In the case that social influence
from a friend f is effective, f would generate an item
following f's preference distribution similarly. Thus, this
model simulates how u picks the item i, including how a
friend f influences u's decision.

As mentioned, both users' preferences and social influence 
among friends are latent variables. Thus, there is a need to 
devise new learning algorithms to estimate the model pa- 
rameters. In this paper, we address this issue by devising a 
new model learning algorithm based on the idea of expecta- 
tion maximization (EM). Moreover, due to the large volume 
of social network datasets and the excessive computational 
cost incurred in learning the generative model parameters, 
we devise a parallel algorithm under the Map-Reduce framework,
in addition to a single-thread algorithm, to process the
large-scale datasets we collect. Finally, to demonstrate the 
flexibility and applicability of our ideas to other recommen- 
dation services that may utilize social influence, we adapt 
our probabilistic generative model and develop an algorithm 
to support group recommendations. The primary contribu- 
tions made in our research are summarized as follows. 

• We argue that social influence is important for item
recommendations and devise probabilistic generative
models that explicitly quantify and incorporate social
influence from friends to a user in the recommendation
process.

1 The term topic, from topic models, represents a genre of
items in this paper. Taking movies as an example of the items,
a topic could be action, thriller, romance or even a latent
genre that cannot be expressed literally.

• We provide model learning methods (based on EM
algorithms) to learn the model parameters from common
user-item pairs. We implement the algorithms on a single
machine and on a parallel processing platform (based on
the Map-Reduce framework [10]) to efficiently process
large-scale data.

• In addition to supporting item recommendation for
individual users, we demonstrate that the quantified social
influence parameter is essential for supporting group
recommendations. Owing to the advantages of social
influence learned in our model, the proposed
social-influence-based group recommendation algorithms
significantly outperform conventional aggregation-based
algorithms.

• We conduct a comprehensive performance evaluation 
on two real datasets crawled from last.fm and whrrl.com. 
Experimental results show that our proposal to incorporate
social influence into generative models for item
recommendation is very effective. The experimental
results for group recommendation also confirm
that the good estimation of social influence in our
generative model is beneficial for group recommenda- 
tion. 

The remainder of this paper is organized as follows.
Section 2 summarizes the related work and provides some
background on probabilistic generative models. Section 3
introduces the design of our generative model, which combines
collaborative filtering and social influence in the
recommendation process. Section 4 discusses how the EM
algorithm is implemented on a single machine and on the
Map-Reduce framework. Section 5 demonstrates how to
incorporate social influence, in addition to collaborative
filtering and item content, into the probabilistic generative
model. Section 6 reviews some previous group recommendation
methods and proposes a new group recommendation method using
the social influence obtained from our model. Section 7 shows
the results of an empirical evaluation of our proposal using
two real datasets. Finally, Section 8 concludes the paper.

2. PRELIMINARY 

In this section, we introduce some related works, includ- 
ing recommendation systems, recommendation in social net- 
works and group recommendation. Then, we provide the 
background on how to utilize a probabilistic generative
model for item recommendation.

2.1 Related Work 

Recommendation System. Item recommendation has 
been a crucial service for many e-commerce and web services 
(e.g. netflix.com and amazon.com). The goal is to recom- 
mend an accurate list of items that the targeted user may 
be interested in. Collaborative filtering and content-based 
techniques are two widely adopted approaches for recom- 
mendation systems pp. Both of them discover users' per- 
sonal interests and utilize these interests to find relevant 
items. Collaborative filtering techniques [11, 13, 30,
32, 36] automatically predict relevant items for a given user
by referencing item rating information from other similar
users. Content-based techniques [25] make recommendations
by matching a user's personal interests (or profiles) to
descriptive item information. Recommendation systems using
pure collaborative filtering approaches tend to fail when
little is known about the user or when no one
has interests similar to the user's. For example, if a user
has little item rating/selection history or his interests are
rare compared to others, the item rating/selection history
of other users cannot help. Although content-based methods
are able to cope with the lack of knowledge, they
fail to account for community endorsement. For example,
even though we know a user is interested in Chinese
restaurants, content-based methods may recommend a
bad Chinese restaurant to him because they do not consider
the consensus of other users. As a result, there has been
continuous research interest and effort in combining the
advantages of both collaborative filtering and content-based 
methods [3J 1281 [31 HZ] ■ Our proposal in this work not only 
is able to naturally integrate the ideas behind collaborative 
filtering and content-based methods but also incorporate so- 
cial influence into the recommendation process. 
Social Recommendation. Under the context of social 
networks, social friendship is shown to be beneficial for rec- 
ommendation [2B HS1 [13 H1IM1 [33]- However, prior 
works in this area are mostly based on ad hoc heuristics. 
How a user is influenced by friends in the item selection 
process remains vague. For example, [34] linearly combines
social influence with conventional collaborative filtering;
[15, 18] employ the random walk [31] approach to incorporate
social network information into the process of item
recommendation; while [19, 20, 21] explore social friendship
via matrix factorization techniques, where social influence is
integrated by simple linear combination [19, 20] or as a
regularization term [21].

In this paper, we propose to employ the probabilistic gen- 
erative model as a methodology to integrate social influence 
with collaborative filtering and content-based methods for 
item recommendation. Our work is uniquely different from 
these previous works because we do not assume social influence
is explicitly available. By learning a quantitative parameter
for social influence, we are able to obtain a better
understanding of the social influence and improve the
performance of recommendation systems. Moreover, the quantified
social influence obtained in our model can support related
applications such as group recommendation [2, 5] and
viral marketing [29"1 151 112] . 

Group Recommendation. To explore how to utilize social
influence for group recommendation, we provide an
in-depth study and comparison of group recommendation
techniques. Group recommendations have been designed for
various domains such as web/news pages [27], tourism [24],
music [5, 23], and TV programs and movies [26, 35]. In
summary, two main approaches have been proposed for group
recommendation [16]. The first one creates an aggregated
profile for a group based on its group members and then
makes recommendations based on the aggregated group
profile [23, 35]. The second approach aggregates the
recommendation results from individual members into a single
group recommendation list. In other words, recommendations
(i.e., ranked item lists) for individual members are created
independently and then aggregated into a joint group
recommendation list [2], where the aggregation functions could
be based on average or least misery strategies [22]. Different
from these methods, our approach reproduces the
process of how group members would express their preferences
and influence other members to reach the final decision.
Evaluation on real datasets demonstrates a significant
improvement of the proposed method using social
influence over the traditional methods.
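
As an illustration only, the following Python sketch shows the two conventional aggregation strategies named above (average and least misery) applied to per-member item scores. The function name `aggregate_scores`, the input layout and the toy ratings are assumptions made for this example; the sketch depicts the baseline aggregation idea, not the social-influence-based method proposed in this paper.

```python
def aggregate_scores(member_scores, strategy="average"):
    """Combine per-member item scores into a ranked group list.

    member_scores: dict mapping member id -> dict mapping item id -> score.
    Only items scored by every member are considered (an assumption of
    this sketch).
    """
    members = list(member_scores)
    common = set(member_scores[members[0]])
    for m in members[1:]:
        common &= set(member_scores[m])

    group = {}
    for item in common:
        scores = [member_scores[m][item] for m in members]
        if strategy == "average":
            group[item] = sum(scores) / len(scores)
        elif strategy == "least_misery":
            # The group score is the score of the least satisfied member.
            group[item] = min(scores)
        else:
            raise ValueError("unknown strategy: " + strategy)

    # Rank by descending group score; ties are broken by item id.
    return sorted(group, key=lambda it: (-group[it], it))


# Hypothetical ratings of two group members for three items.
scores = {
    "alice": {"a": 5, "b": 3, "c": 1},
    "bob": {"a": 1, "b": 3, "c": 2},
}
ranking_avg = aggregate_scores(scores, "average")
ranking_lm = aggregate_scores(scores, "least_misery")
```

In this toy input, item "a" is loved by one member and disliked by the other, so the two strategies disagree on where to rank it.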

2.2 Background 
Figure 1: Probabilistic generative model for collaborative
filtering

The recommendation techniques we propose in this paper
are inspired by the probabilistic generative model developed
for collaborative filtering in [13]. Let
$U = \{u_1, u_2, \ldots, u_N\}$ and $I = \{i_1, i_2, \ldots, i_M\}$
be the user set and item set, respectively. A latent topic set
$Z = \{z_1, z_2, \ldots, z_K\}$ is assumed to capture latent
user interests and item profiles. In the context of item
recommendation, an event of a user $u \in U$ accessing an
item $i \in I$ is considered to be associated with one of
the latent topic variables $z \in Z$. Conceptually, as shown in
Figure 1, user $u$ chooses a topic $z \in Z$ according to his
interest distribution, and in turn the topic $z$
probabilistically "generates" an item $i$ according to the
distribution of items associated with $z$. Under this model,
users are assumed to be independent of items given the chosen
topic. The joint probability distribution over user $u$, topic
$z$ and item $i$ can be written as

$$\Pr(u, z, i) = \Pr(u)\,\Pr(z \mid u)\,\Pr(i \mid z).$$

An equivalent specification of the joint probability
distribution that treats users and items symmetrically is

$$\Pr(u, z, i) = \Pr(z)\,\Pr(u \mid z)\,\Pr(i \mid z).$$

Since we are only interested in how likely a user $u$ is to
choose an item $i$, the joint distribution over $u$ and $i$ is

$$\Pr(u, i) = \sum_{z \in Z} \Pr(u, z, i) = \sum_{z \in Z} \Pr(z)\,\Pr(u \mid z)\,\Pr(i \mid z). \tag{1}$$

This model has a set of parameters $\Pr(z)$, $\Pr(u \mid z)$
and $\Pr(i \mid z)$ for all $z \in Z$, $u \in U$, $i \in I$,
which for simplicity is denoted by $\theta$. In [14], the
user-item co-occurrence history $H = \{(u, i)\}$, which
contains all the observed user-item pairs, is used to learn
the model parameters $\theta$. One way to learn $\theta$
is to maximize the log-likelihood of the history data:

$$\mathcal{L}(\theta) = \sum_{(u, i) \in H} \log \Pr(u, i \mid \theta), \tag{2}$$

where each $\Pr(u, i \mid \theta)$ can be found using the
model parameters as in Equation (1).

After the model parameters are inferred, items can be ranked
for a given user according to $\Pr(i \mid u)$, the
probability that the user $u$ selects the item $i$.
$\Pr(i \mid u)$ can be computed as

$$\Pr(i \mid u) = \frac{\Pr(u, i)}{\Pr(u)} \propto \Pr(u, i). \tag{3}$$

Since most recommendation systems only focus on
recommending new items (items not present in $H$ for a
particular user $u$), items with a higher $\Pr(i \mid u)$
that have not been accessed by $u$ are good recommendations.
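
The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the parameter tables are plain dictionaries, and the names `joint_prob` and `recommend` as well as all toy values are hypothetical, not taken from the paper.

```python
def joint_prob(u, i, p_z, p_u_given_z, p_i_given_z):
    """Pr(u, i) = sum over z of Pr(z) Pr(u|z) Pr(i|z), following Equation (1)."""
    return sum(p_z[z] * p_u_given_z[z][u] * p_i_given_z[z][i] for z in p_z)


def recommend(u, items, accessed, p_z, p_u_given_z, p_i_given_z, top_n=2):
    """Rank items not yet accessed by u; Pr(u, i) is proportional to Pr(i|u)."""
    candidates = [i for i in items if i not in accessed]
    candidates.sort(
        key=lambda i: joint_prob(u, i, p_z, p_u_given_z, p_i_given_z),
        reverse=True,
    )
    return candidates[:top_n]


# Toy parameters (hypothetical values): two topics, two users, three items.
p_z = {"rock": 0.5, "jazz": 0.5}
p_u_given_z = {"rock": {"u1": 0.8, "u2": 0.2}, "jazz": {"u1": 0.2, "u2": 0.8}}
p_i_given_z = {
    "rock": {"i1": 0.6, "i2": 0.3, "i3": 0.1},
    "jazz": {"i1": 0.1, "i2": 0.3, "i3": 0.6},
}

# User u1 has already accessed i1, so only i2 and i3 are candidates.
top_items = recommend("u1", ["i1", "i2", "i3"], {"i1"},
                      p_z, p_u_given_z, p_i_given_z)
```

Because $\Pr(u)$ is constant for a fixed user, sorting by the joint probability gives the same order as sorting by $\Pr(i \mid u)$.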



The probabilistic generative model described above is based 
on the ideas of collaborative filtering. Although [28] has 
extended the model to integrate item contents as an addi- 
tional component, social influence has not been considered 
yet. Moreover, as will be shown later, incorporating social
influence into the generative model is fundamentally more 
challenging than integrating item contents into the model 
because item contents are observable from the training data, 
while the social influence is a hidden factor not directly ob- 
servable. In this paper, we demonstrate how to integrate 
social influence into this model and introduce our approach 
to infer model parameters. 

3. SOCIAL INFLUENCE IN ACTION 

In this section, we introduce our approach to incorporate 
social influence in a new probabilistic generative model for 
item recommendation. The ultimate goal of our study
is to unify the ideas of social influence, collaborative
filtering and content-based methods in one model for item
recommendation. For simplicity, we first discuss how to
integrate collaborative filtering and social influence into a
probabilistic generative model. We will introduce the complete
model (including collaborative filtering, item content and
social influence) later in Section 5.
Figure 2: Probabilistic generative model combining
social influence and collaborative filtering

Here we propose a new probabilistic generative model that
describes the process of how a user selects an item by taking
into account the user's own preferences and the social
influence from his friends. Let $F(u) \subset U$ denote the
friend list of a user $u$. The social influence introduced in
this model aims to capture the scenario in which one of $u$'s
friends ($f \in F(u)$) has contributed his opinions to the
item selection process. Here, for simplicity, we assume $u$ is
a special friend of himself (i.e., $u \in F(u)$). Our model is
depicted in Figure 2. As shown, a user $u$ first picks a
friend (including himself) $f \in F(u)$ to make the item
selection. If the picked friend $f$ happens to be himself
($f = u$), $u$ is not influenced by anyone else in this
selection. If, however, the picked friend $f$ is not $u$,
$u$ is influenced by $f$ this time and thus the selected item
follows $f$'s interests rather than $u$'s own tastes. In this
model, we define a parameter, the social influence
distribution $\Pr(f \mid u)$, as the probability that $u$ is
influenced by a friend $f$.² After $f$ is chosen based on
$\Pr(f \mid u)$, $f$ randomly chooses a topic $z$ according to
his interests, and then the topic generates an item $i$
according to the topic's item distribution.

The joint probability distribution over users, friends,
topics and items is

$$\Pr(u, f, z, i) = \Pr(u)\,\Pr(f \mid u)\,\Pr(z \mid f)\,\Pr(i \mid z), \tag{4}$$

where $u \in U$, $i \in I$, $f \in F(u)$ and $z \in Z$.
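
To make the generative process concrete, the following Python sketch simulates item selections by drawing a friend, then a topic, then an item, in the order of the factors on the right-hand side of Equation (4). All distributions, identifiers and the helper names `draw` and `sample_selection` are hypothetical choices for this illustration.

```python
import random


def draw(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample_selection(u, p_f_given_u, p_z_given_f, p_i_given_z, rng):
    """Simulate one item selection by user u: friend f, then topic z, then item i."""
    f = draw(p_f_given_u[u], rng)  # f may equal u (no social influence)
    z = draw(p_z_given_f[f], rng)  # the topic follows f's interests
    i = draw(p_i_given_z[z], rng)  # the item follows the topic's item distribution
    return f, z, i


# Toy distributions (hypothetical values).
p_f_given_u = {"u": {"u": 0.7, "v": 0.3}}
p_z_given_f = {"u": {"thai": 0.9, "pizza": 0.1},
               "v": {"thai": 0.2, "pizza": 0.8}}
p_i_given_z = {"thai": {"r1": 0.5, "r2": 0.5}, "pizza": {"r3": 1.0}}

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_selection("u", p_f_given_u, p_z_given_f, p_i_given_z, rng)
           for _ in range(1000)]
```

In real data only the pair (u, i) of each sample would be observed; f and z are exactly the hidden quantities the learning algorithm has to infer.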

2 We will demonstrate later that this parameter is very useful
not only for item recommendation to individual users but
also for group recommendation.

The key observations from this model are that 1) given $f$,
$u$ is conditionally independent of $z$ and $i$, and 2) given
$z$, $i$ is conditionally independent of $u$ and $f$. Because
we intend to model the item selection probability in terms of
social influence and topics (the two latent variables in our
model), we transform Equation (4) into the following form:

$$\begin{aligned}
\Pr(u, f, z, i) &= \Pr(u \mid f, z, i)\,\Pr(f, z, i) = \Pr(u \mid f)\,\Pr(f, z, i) \\
&= \Pr(z)\,\Pr(u \mid f)\,\Pr(f \mid z)\,\Pr(i \mid z).
\end{aligned} \tag{5}$$
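
The transformation from Equation (4) to Equation (5) can also be checked directly with two applications of Bayes' rule. The following is only a worked sketch; the marginal $\Pr(f)$ is an intermediate quantity introduced here and is not one of the model parameters.

```latex
\begin{align*}
\Pr(u)\,\Pr(f \mid u)\,\Pr(z \mid f)\,\Pr(i \mid z)
  &= \Pr(f)\,\Pr(u \mid f)\,\Pr(z \mid f)\,\Pr(i \mid z)
  && \text{since } \Pr(u)\Pr(f \mid u) = \Pr(u, f) = \Pr(f)\Pr(u \mid f) \\
  &= \Pr(z)\,\Pr(f \mid z)\,\Pr(u \mid f)\,\Pr(i \mid z)
  && \text{since } \Pr(f)\Pr(z \mid f) = \Pr(f, z) = \Pr(z)\Pr(f \mid z)
\end{align*}
```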

Thus, the joint distribution over users and items is:

$$\Pr(u, i) = \sum_{z \in Z} \sum_{f \in F(u)} \Pr(z)\,\Pr(u \mid f)\,\Pr(f \mid z)\,\Pr(i \mid z). \tag{6}$$

Different from the generative model in [14] (see
Equation (1)), the newly proposed model has two latent
variables, namely the topic variable ($z$) and the social
influence variable ($f$). Correspondingly, the model
parameters $\theta$ now include
$\{\Pr(z), \Pr(u \mid f), \Pr(f \mid z), \Pr(i \mid z)\}$.
Note that the number of parameters is increased by
$|U| \cdot |F(u)|$ for the social influence $\Pr(u \mid f)$.
Notice that while the friend space could potentially
be the entire user space, the average number of friends
per user is limited.³ This is very important because it
ensures that our latent variable space is small enough
not to over-complicate the model. Moreover, the small latent
parameter space yields high-quality parameter estimates
even when the available history $H$ is not large.

In this study, we employ expectation maximization (EM) to
learn the model parameters from the user-item history $H$.
However, the conventional EM algorithm developed for a single
latent variable is not applicable to our model because we now
have two latent variables, i.e., social influence and topics.
To address this challenging issue, we have performed a
detailed mathematical derivation to develop a new EM algorithm
to infer the model parameters.⁴

The derived EM algorithm iterates over the following steps: 

1. E-step: Computes the posterior of the latent variables,
$\Pr(z, f \mid u, i)$ for all $(u, i) \in H$, using the model
parameters of the previous iteration.

2. M-step: Computes new model parameters by maxi- 
mizing the expected log-likelihood. 

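In the E-step, we only need to use the previous iteration's
model parameters to find $\Pr(z, f \mid u, i)$ as:

$$\Pr(z, f \mid u, i) = \frac{\Pr(z)\,\Pr(f \mid z)\,\Pr(u \mid f)\,\Pr(i \mid z)}{\sum_{z' \in Z} \sum_{f' \in F(u)} \Pr(z')\,\Pr(f' \mid z')\,\Pr(u \mid f')\,\Pr(i \mid z')}. \tag{7}$$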

Also note that we only need to compute the posteriors of 
those pairs presented in the history H instead of all the pos- 
sible user-item pairs, because the expectation to be maxi- 
mized only weights on the observed user-item pairs. 
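
As a small illustration of the E-step for a single observed pair, the sketch below computes the posterior over all (topic, friend) combinations and normalizes it. The dictionary layout of `theta`, the function name `posterior` and the toy values are assumptions of this example, not the paper's implementation.

```python
def posterior(u, i, friends, topics, theta):
    """Return {(z, f): Pr(z, f | u, i)} for one observed pair (u, i)."""
    joint = {}
    for f in friends[u]:
        for z in topics:
            joint[(z, f)] = (theta["z"][z] * theta["f|z"][(f, z)]
                             * theta["u|f"][(u, f)] * theta["i|z"][(i, z)])
    total = sum(joint.values())  # equals Pr(u, i) under the current parameters
    return {key: value / total for key, value in joint.items()}


# Toy setting (hypothetical values): two users who are friends, two topics.
friends = {"u": ["u", "v"], "v": ["v", "u"]}  # each user is his own friend
topics = ["t1", "t2"]
theta = {
    "z": {"t1": 0.5, "t2": 0.5},
    "f|z": {("u", "t1"): 0.6, ("v", "t1"): 0.4,
            ("u", "t2"): 0.3, ("v", "t2"): 0.7},
    "u|f": {("u", "u"): 0.5, ("v", "u"): 0.5,
            ("u", "v"): 0.5, ("v", "v"): 0.5},
    "i|z": {("a", "t1"): 0.7, ("b", "t1"): 0.3,
            ("a", "t2"): 0.2, ("b", "t2"): 0.8},
}
post = posterior("u", "a", friends, topics, theta)
```

The result has one entry per (topic, friend) combination for user u, and the entries sum to one.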

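The M-step finds new model parameters that maximize
the expected log-likelihood found in the E-step. According
to our derivation, the new model parameters should be
updated as

$$\begin{aligned}
\mathrm{Pr}^{+}(z) &\propto \sum_{(u', i') \in H} \sum_{f' \in F(u')} \Pr(z, f' \mid u', i'), \\
\mathrm{Pr}^{+}(u \mid f) &\propto \sum_{(u, i') \in H \,\wedge\, f \in F(u)} \sum_{z' \in Z} \Pr(z', f \mid u, i'), \\
\mathrm{Pr}^{+}(f \mid z) &\propto \sum_{(u', i') \in H} \Pr(z, f \mid u', i'), \\
\mathrm{Pr}^{+}(i \mid z) &\propto \sum_{(u', i) \in H} \sum_{f' \in F(u')} \Pr(z, f' \mid u', i).
\end{aligned} \tag{8}$$

3 In our collected real data sets, the average number of
friends per user is less than 10.

4 Due to the space limit, we present the algorithm here but keep
the EM algorithm derivation in the appendix.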

Equation (8) shows that for each parameter distribution,
the new value is the normalized sum of the corresponding
posteriors. For example, $\mathrm{Pr}^{+}(f \mid z)$ is
obtained by taking the sum of all the latent variable
posteriors related to $f$ and $z$. Then, because
$\sum_{f} \mathrm{Pr}^{+}(f \mid z) = 1$, we need to normalize
the posterior sums over all $f$ to obtain the correct
model parameters $\mathrm{Pr}^{+}(f \mid z)$.

By repeating the E-step and M-step, the EM Algorithm 
improves the model parameters iteratively until they con- 
verge to a local log-likelihood maximum. The learned model 
parameters are used for item recommendations by ranking 
items for a given user according to 

$$\Pr(i \mid u) \propto \sum_{f \in F(u),\, z \in Z} \Pr(u, f, z, i), \tag{9}$$

which can be calculated by Equation (5).

4. LEARNING ALGORITHMS 

In this section, we discuss how to implement the EM al- 
gorithm efficiently to learn the model parameters. Here we 
first present our algorithm for a single machine. A challenge
encountered in our initial research effort is that the EM
algorithm, even when fine-tuned, is still slow due to the
excessive computation incurred in processing large-scale
datasets. To overcome this challenge, we develop a parallel
processing version of the EM algorithm on the Map-Reduce framework.
Through this effort, we demonstrate that our design of the 
EM algorithm can be elegantly decomposed for efficient par- 
allel processing using Map-Reduce. 

4.1 Single Machine Algorithm 

We first show an implementation that efficiently realizes
our EM algorithm on a single machine. For simplicity, we
only present one iteration of the E-step and the M-step,
which aims to compute the model parameters $\theta_{x+1}$
from the current estimate $\theta_x$.

Algorithm 1 executes one EM iteration to find the next
model parameters $\theta_{x+1}$. Notice that we do not execute
the E-step separately. Because we only need to compute
$\Pr(f, z \mid u, i)$ once for each user-item pair observed in
$H$, we embed the E-step computation in the M-step so the
posteriors are computed only as needed. Therefore, for each
observed $(u, i)$, the E-step is executed once in line 4 and
the M-step is executed in lines 5 to 8 to accumulate the
latent variable posteriors into the corresponding posterior
sums (e.g., $\mathrm{Pr}^{+}(f \mid z)$ takes the sum of all
the posteriors with the same friend and topic ids). After all
the observed user-item pairs are examined, the M-step
normalizes the posterior sums into the next iteration's
parameters in line 9. These accumulation and normalization
steps realize the M-step in Equation (8).

Algorithm 1: Social Influence EM Algorithm

Input: Data set: H = {(u, i)}, Model parameters:
  θ_x = {Pr(u|f), Pr(f|z), Pr(i|z), Pr(z)}
Output: Next parameters:
  θ_{x+1} = {Pr+(u|f), Pr+(f|z), Pr+(i|z), Pr+(z)}

1 for (u, i) ∈ H do
2   for f ∈ F(u) do
3     for z ∈ Z do
4       Compute Pr(f, z|u, i);
5       Pr+(f|z) ← Pr+(f|z) + Pr(f, z|u, i);
6       Pr+(i|z) ← Pr+(i|z) + Pr(f, z|u, i);
7       Pr+(u|f) ← Pr+(u|f) + Pr(f, z|u, i);
8       Pr+(z) ← Pr+(z) + Pr(f, z|u, i);
9 Normalize Pr+(z), Pr+(f|z), Pr+(i|z), Pr+(u|f);
10 return θ_{x+1} = {Pr+(u|f), Pr+(f|z), Pr+(i|z), Pr+(z)}

The running time of this EM algorithm is
$O(|H| \cdot |Z| \cdot |F(u)|)$, where $|H|$ is the total number
of observed user-item pairs, $|Z|$ is the number of latent
topics and $|F(u)|$ is the average number of friends per user.
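
One EM iteration of the kind described in Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch under assumptions, not the authors' code: the parameters are stored in dictionaries keyed by tuples, the posterior is normalized per observed pair before accumulation, and the names `em_iteration` and `normalize` as well as the toy data are hypothetical.

```python
from collections import defaultdict


def normalize(table, group_of):
    """Scale the entries of each group (as given by group_of(key)) to sum to 1."""
    totals = defaultdict(float)
    for key, value in table.items():
        totals[group_of(key)] += value
    return {key: value / totals[group_of(key)] for key, value in table.items()}


def em_iteration(history, friends, topics, theta):
    """Run one EM iteration and return the next parameter tables."""
    sums = {name: defaultdict(float) for name in ("z", "f|z", "u|f", "i|z")}
    for u, i in history:
        # E-step for this pair: unnormalized joint over (f, z).
        joint = {}
        for f in friends[u]:
            for z in topics:
                joint[(f, z)] = (theta["z"][z] * theta["f|z"][(f, z)]
                                 * theta["u|f"][(u, f)] * theta["i|z"][(i, z)])
        total = sum(joint.values())
        # M-step, part 1: accumulate posteriors into posterior sums.
        for (f, z), value in joint.items():
            post = value / total
            sums["z"][z] += post
            sums["f|z"][(f, z)] += post
            sums["u|f"][(u, f)] += post
            sums["i|z"][(i, z)] += post
    # M-step, part 2: normalize each posterior sum into a distribution.
    return {
        "z": normalize(sums["z"], lambda key: None),
        "f|z": normalize(sums["f|z"], lambda key: key[1]),  # over f for each z
        "u|f": normalize(sums["u|f"], lambda key: key[1]),  # over u for each f
        "i|z": normalize(sums["i|z"], lambda key: key[1]),  # over i for each z
    }


# Toy data (hypothetical): three observations, two users, two topics.
history = [("u", "a"), ("u", "b"), ("v", "b")]
friends = {"u": ["u", "v"], "v": ["v", "u"]}
topics = ["t1", "t2"]
theta = {
    "z": {"t1": 0.5, "t2": 0.5},
    "f|z": {("u", "t1"): 0.6, ("v", "t1"): 0.4,
            ("u", "t2"): 0.3, ("v", "t2"): 0.7},
    "u|f": {("u", "u"): 0.5, ("v", "u"): 0.5,
            ("u", "v"): 0.5, ("v", "v"): 0.5},
    "i|z": {("a", "t1"): 0.7, ("b", "t1"): 0.3,
            ("a", "t2"): 0.2, ("b", "t2"): 0.8},
}
theta_next = em_iteration(history, friends, topics, theta)
```

Calling `em_iteration` repeatedly on its own output would iterate EM; a convergence check on the log-likelihood is left out of this sketch.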

4.2 Parallelized Map-Reduce Algorithms 

In this section, we show how we decompose Algorithm 1
for parallel processing. Notice that there are three
computation components in one EM iteration: 1) the E-step,
which computes the posteriors $\Pr(f, z \mid u, i)$; 2) the
accumulation of posteriors into posterior sums; and 3) the
normalization step, which obtains the model parameters for
the next iteration. Among them, Steps 1 and 3 cannot be
parallelized because Step 1 requires knowledge of all the
related model parameters $\theta_x$ and Step 3 requires the
entire set of $\theta_{x+1}$ for parameter normalization.
Therefore, based on the design principle of Map-Reduce
algorithms, we execute Step 3 of the previous iteration
along with Step 1. As such, the non-parallelizable
components are combined to avoid the overhead of another round
of Map-Reduce for the same task.

Algorithm 2: Social Influence EM Mapper Algorithm

Input: Partial dataset: H_x = {(u, i)}, Un-normalized
  model parameters: θ_x = {Pr(u|f), Pr(f|z), Pr(i|z)}
Output: Intermediate probabilities in key-value pairs.

1 Normalize Pr(z), Pr(f|z), Pr(i|z), Pr(u|f);
2 for (u, i) ∈ H_x do
3   for f ∈ F(u) do
4     for z ∈ Z do
5       Compute Pr(f, z|u, i);
6       Pr(f|u, i) ← Pr(f|u, i) + Pr(f, z|u, i);
7     Emit key: f, value: (Pr(f, z_0|u, i), Pr(f, z_1|u, i), ···);
8     Emit key: i, value: (Pr(f, z_0|u, i), Pr(f, z_1|u, i), ···);
9   Emit key: u, value: (Pr(f_0|u, i), Pr(f_1|u, i), ···);



Algorithm 3: Social Influence EM Reducer Algorithm

Input: Grouped intermediate probabilities.
Output: Un-normalized next parameters:
  θ_{x+1} = {Pr+(u|f), Pr+(f|z), Pr+(i|z)}

1 for key = K, values = (V_0, V_1, ···) from input do
2   V ← V_0 + V_1 + ···;
3   Emit key: K, value: V;



The algorithms for the mapper side and the reducer side are
shown in Algorithm 2 and Algorithm 3, respectively. At
its start, each mapper normalizes the posterior sums from the
previous results to construct the model parameters $\theta_x$
(see line 1). Then the user-item pairs are processed in
parallel at different mappers (line 2) because each mapper now
has the same global knowledge of $\theta_x$. The accumulation
step is done in the reducers to find the posterior sums for
the next iteration.

Since each mapper only processes a portion of the user-item
history, one mapper does not have the entire knowledge
of $(u, i) \in H$, and thus cannot accumulate correct posterior
sums (e.g., $\mathrm{Pr}^{+}(f \mid z)$). To address this
problem, we move the accumulation step to the reducers
(Algorithm 3). In particular, we pack the posteriors for a
specific user, item or friend id together and emit the packed
value (lines 7-9). To ensure these posteriors are correctly
accumulated in the reducers, a standard shuffle-and-sort is
performed on all the emitted key-value pairs. In this way, the
Map-Reduce framework ensures that all the emitted values for
the same key are grouped and processed together in the same
reducer.

Thanks to the shuffle-and-sort step, the reducer algorithm
(Algorithm 3) is very simple: it only takes the sum of all
the grouped values and outputs the sums. Let us take the
$\mathrm{Pr}^{+}(f \mid z)$ computation as an example to
understand how the reducer performs its task. When a mapper
runs Algorithm 2, only a part of the $(u, i)$ pairs are
processed and the posteriors with respect to a particular $f$
are emitted to the mapper outputs. Although emitted key-value
pairs with the same key ($f$) may come from different mappers,
they are grouped by key ($f$) after the keys are shuffled and
sorted. Because all the values (packed with posteriors) for
the same key $f$ are now grouped, the reducer can simply sum
the posteriors to find the correct posterior sums. Similar
steps apply to the other parameter computations, i.e.,
$\mathrm{Pr}^{+}(i \mid z)$ and $\mathrm{Pr}^{+}(u \mid f)$. A
reducer does not need to differentiate key types, because the
accumulation steps are the same for user/item/friend ids.
Recall that these reducer outputs are not yet
$\theta_{x+1}$. What is left is the normalization of each
posterior sum, which is done in the next Map-Reduce iteration.

The above Map-Reduce EM algorithms address the scalability
issue in learning model parameters. We find that the
Map-Reduce framework is quite suitable for expediting our EM
algorithm because the posterior computation and accumu- 
lation (which incur significant cost) can be done in parallel. 
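
The division of labor between mappers and reducers can be imitated in memory with a few Python functions. The sketch below is a simplified stand-in, not a Hadoop job and not the authors' implementation: it emits one scalar posterior per key instead of the packed vectors of Algorithm 2, it assumes `theta` is already normalized, and every name in it (`mapper`, `shuffle`, `reducer`, `map_reduce_round`) is hypothetical.

```python
from collections import defaultdict


def mapper(partition, friends, topics, theta):
    """Emit (key, posterior) pairs for one partition of the history."""
    for u, i in partition:
        joint = {(f, z): theta["z"][z] * theta["f|z"][(f, z)]
                         * theta["u|f"][(u, f)] * theta["i|z"][(i, z)]
                 for f in friends[u] for z in topics}
        total = sum(joint.values())
        for (f, z), value in joint.items():
            post = value / total
            yield ("z", z), post
            yield ("f|z", f, z), post
            yield ("u|f", u, f), post
            yield ("i|z", i, z), post


def shuffle(emitted):
    """Group emitted values by key, as the framework's shuffle-and-sort would."""
    grouped = defaultdict(list)
    for key, value in emitted:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Sum all values of one key; the key type does not matter."""
    return key, sum(values)


def map_reduce_round(partitions, friends, topics, theta):
    """Run all mappers, shuffle, then reduce; returns un-normalized sums."""
    emitted = []
    for part in partitions:  # each partition would run on a separate mapper
        emitted.extend(mapper(part, friends, topics, theta))
    return dict(reducer(k, v) for k, v in shuffle(emitted).items())


# Toy data (hypothetical values).
history = [("u", "a"), ("u", "b"), ("v", "b")]
friends = {"u": ["u", "v"], "v": ["v", "u"]}
topics = ["t1", "t2"]
theta = {
    "z": {"t1": 0.5, "t2": 0.5},
    "f|z": {("u", "t1"): 0.6, ("v", "t1"): 0.4,
            ("u", "t2"): 0.3, ("v", "t2"): 0.7},
    "u|f": {("u", "u"): 0.5, ("v", "u"): 0.5,
            ("u", "v"): 0.5, ("v", "v"): 0.5},
    "i|z": {("a", "t1"): 0.7, ("b", "t1"): 0.3,
            ("a", "t2"): 0.2, ("b", "t2"): 0.8},
}
sums_two = map_reduce_round([history[:2], history[2:]], friends, topics, theta)
sums_one = map_reduce_round([history], friends, topics, theta)
```

Splitting the history into two partitions or keeping it in one yields the same posterior sums, which is the property that makes the accumulation step safe to distribute.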

5. UNIFIED GENERATIVE MODEL 

As mentioned earlier, we aim to develop a new generative
model to unify the ideas of social influence, collaborative
filtering and content-based methods for item recommendation.
In this section, we present the unified generative model
developed for this ultimate goal.




Figure 3: Unified generative model integrating social
influence, collaborative filtering and content-based methods

The unified generative model, which depicts a more general
process of item selection, is shown in Figure 3. As shown,
the early steps of the process are similar to those of the
model introduced earlier (see Figure 2). However, the selected
topic now generates an item i and a content description w.
Therefore, a topic in this new model is associated not only
with a distribution of items but also with a distribution of
item content (e.g., tag words). Notice that here we assume
items and contents are conditionally independent given the
topic. As such, the similarity of item contents is taken into
account in the recommendation process. As a result, the joint
probability distribution over all factors is:

$$
\Pr(u, f, z, i, w) = \Pr(u)\,\Pr(f \mid u)\,\Pr(z \mid f)\,\Pr(i \mid z)\,\Pr(w \mid z)
\tag{10}
$$

where w ∈ W and W is the space of possible item contents.
For example, w could be a word of the content vocabulary
or a tag of the tag space. Similar to Equation (3), the joint
probability distribution can be rewritten as:

$$
\Pr(u, f, z, i, w) = \Pr(z)\,\Pr(u \mid f)\,\Pr(f \mid z)\,\Pr(i \mid z)\,\Pr(w \mid z)
\tag{11}
$$

Now the remaining issue is to learn the set of all the model
parameters θ. Different from what we discussed earlier in
Section 3, the dataset used for learning now consists of three
elements, namely users, items and tags, i.e., (u, i, w) ∈ H,
where u ∈ U, i ∈ I, and w ∈ W_i (W_i denotes the tag/word set
associated with item i). Note that an item may contain
multiple tags/words. For a history record of a user u
selecting an item i where W_i = {w_1, w_2, ...}, we have
(u, i, w_k) ∈ H, k = 1, 2, .... Notice that θ now includes an
extra parameter Pr(w|z) (∀z ∈ Z, w ∈ W) in addition to the
other model parameters discussed earlier in our preliminary
generative model (see Section 3). The approach to learning
the model parameters is still to maximize the log-likelihood
L(θ). However, the details are different.

In the E-step, instead of computing the expectation of the
log-likelihood for individual latent variables (e.g., z or f
individually), we propose to compute the expectation of the
log-likelihood for the joint latent variables (i.e., z and f
together). More specifically, we calculate

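$$
\Pr(z, f \mid u, i, w) =
\frac{\Pr(z)\,\Pr(f \mid z)\,\Pr(u \mid f)\,\Pr(i \mid z)\,\Pr(w \mid z)}
{\sum_{z' \in Z} \sum_{f' \in F(u)} \Pr(z')\,\Pr(f' \mid z')\,\Pr(u \mid f')\,\Pr(i \mid z')\,\Pr(w \mid z')}
\tag{12}
$$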
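
The E-step posterior of Equation (12) can be sketched in a few lines. This is a minimal illustration, assuming the parameters are stored in plain dictionaries keyed by tuples; the function name `joint_posterior`, the dictionary names and the toy values are hypothetical and not taken from the paper.

```python
def joint_posterior(u, i, w, friends, p_z, p_f_z, p_u_f, p_i_z, p_w_z):
    """Return {(z, f): Pr(z, f | u, i, w)} for one history record.

    Assumed layout: p_z[z], p_f_z[(f, z)], p_u_f[(u, f)], p_i_z[(i, z)],
    p_w_z[(w, z)], and friends[u] listing F(u), which includes u itself.
    """
    weights = {}
    for z, pz in p_z.items():
        for f in friends[u]:
            # Numerator of the posterior for one (z, f) combination.
            weights[(z, f)] = (
                pz
                * p_f_z.get((f, z), 0.0)
                * p_u_f.get((u, f), 0.0)
                * p_i_z.get((i, z), 0.0)
                * p_w_z.get((w, z), 0.0)
            )
    total = sum(weights.values())  # denominator: sum over all z and f
    if total == 0.0:
        return {key: 0.0 for key in weights}
    return {key: value / total for key, value in weights.items()}


# Toy parameters (assumed values, for illustration only).
friends = {"u1": ["u1", "u2"]}
p_z = {0: 0.5, 1: 0.5}
p_f_z = {("u1", 0): 0.5, ("u2", 0): 0.5, ("u1", 1): 0.5, ("u2", 1): 0.5}
p_u_f = {("u1", "u1"): 0.8, ("u1", "u2"): 0.2}
p_i_z = {("i1", 0): 0.6, ("i1", 1): 0.2}
p_w_z = {("rock", 0): 0.5, ("rock", 1): 0.5}

posterior = joint_posterior("u1", "i1", "rock", friends,
                            p_z, p_f_z, p_u_f, p_i_z, p_w_z)
```

Because z and f are normalized jointly, the returned values sum to one over all (z, f) pairs for the record, rather than separately over z and over f.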

In the M-step, the model parameters are computed to maximize
the expected log-likelihood found in the E-step, as below.

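$$
\begin{aligned}
\Pr^{+}(i \mid z) &\propto \sum_{(u', i, w') \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i, w') \\
\Pr^{+}(w \mid z) &\propto \sum_{(u', i', w) \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i', w) \\
\Pr^{+}(u \mid f) &\propto \sum_{(u, i', w') \in H \,\wedge\, f \in F(u)} \; \sum_{z' \in Z} \Pr(z', f \mid u, i', w') \\
\Pr^{+}(f \mid z) &\propto \sum_{(u', i', w') \in H \,\wedge\, f \in F(u')} \Pr(z, f \mid u', i', w') \\
\Pr^{+}(z) &\propto \sum_{(u', i', w') \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i', w')
\end{aligned}
\tag{13}
$$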

The parameter maximization method is similar to the one used
for the preliminary model. Note that the summed latent
variable posterior is different and that we have an additional
set of parameters in Pr^+(w|z).

After the model is learned, items can be ranked for a given
user based on Pr(i|u), which can be approximated in the same
way as for the preliminary model, using

$$
\Pr(u, i) = \sum_{w \in W_i} \; \sum_{z \in Z,\, f \in F(u)} \Pr(z)\,\Pr(i \mid z)\,\Pr(f \mid z)\,\Pr(u \mid f)\,\Pr(w \mid z)
$$

Note that we are only interested in the tags/words associated
with the given item when we calculate the user-item joint
probability. An item i with a high Pr(i|u) that user u has
not yet accessed is a good candidate for recommendation.
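
The ranking step can be sketched as follows. This is an illustrative sketch only: it assumes the joint score sums over the item's own tags, all topics and the user's friends, and it stores the learned parameters in a hypothetical `model` dictionary whose keys (`tags_of`, `friends`, `p_z`, ...) are invented for this example.

```python
def joint_user_item(u, i, model):
    """Score Pr(u, i) by summing over the item's tags w, topics z, friends f."""
    score = 0.0
    for w in model["tags_of"][i]:
        for z, pz in model["p_z"].items():
            for f in model["friends"][u]:
                score += (
                    pz
                    * model["p_i_z"].get((i, z), 0.0)
                    * model["p_f_z"].get((f, z), 0.0)
                    * model["p_u_f"].get((u, f), 0.0)
                    * model["p_w_z"].get((w, z), 0.0)
                )
    return score


def recommend(u, candidates, accessed, n, model):
    """Return the n highest-scoring items that u has not accessed yet."""
    fresh = [i for i in candidates if i not in accessed]
    fresh.sort(key=lambda i: joint_user_item(u, i, model), reverse=True)
    return fresh[:n]


# Toy model (assumed values): one user who is their own only "friend".
model = {
    "tags_of": {"i1": ["a"], "i2": ["a"], "i3": ["a"]},
    "friends": {"u": ["u"]},
    "p_z": {0: 0.5, 1: 0.5},
    "p_f_z": {("u", 0): 1.0, ("u", 1): 1.0},
    "p_u_f": {("u", "u"): 1.0},
    "p_w_z": {("a", 0): 1.0, ("a", 1): 1.0},
    "p_i_z": {("i1", 0): 0.1, ("i1", 1): 0.1,
              ("i2", 0): 0.7, ("i2", 1): 0.3,
              ("i3", 0): 0.2, ("i3", 1): 0.6},
}

top = recommend("u", ["i1", "i2", "i3"], accessed={"i2"}, n=2, model=model)
```

Since Pr(u) is the same for every candidate item of a fixed user, sorting by the joint score gives the same order as sorting by Pr(i|u).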

6. GROUP RECOMMENDATION 

Given a group of people G, group recommendation aims to find
items that are welcomed by the whole group rather than by
individual group members. This recommendation service has a
very large application base, e.g., coordinating a group of
people to find quality activities, venues, restaurants or
movies. Although the generative models we proposed earlier
target item recommendation for an individual user, the social
influence parameter learned in our models is very useful for
group recommendation. In this section, we first introduce the
state-of-the-art approaches to group recommendation, namely
aggregation-based group recommendation, and then discuss how
to apply the quantified social influence obtained from our
models to develop a new algorithm, called social influence
based group recommendation.

6.1 Aggregation-based Recommendation 

For group recommendation, one popular approach is the ranking
aggregation method, which finds a "consensus" ranking/score
for each item for the whole group. Given the individual
ranking/score for each member, an aggregation method is
employed to obtain a group ranking/score from the individual
rankings/scores. In this paper, we review two popular
aggregation strategies, average and least misery, recently
proposed in [22].

Average - With the average aggregation strategy, the group
score of an item i is defined as the average of the scores
from the individual group members. Using the item access
probability estimate Pr(i|u) as the score, the group score for
an item i to group G is calculated as

$$
S_{\text{average}}(G, i) = \frac{1}{|G|} \sum_{u \in G} \Pr(i \mid u)
\tag{14}
$$

Accordingly, the recommendation ranking can be computed 
by sorting the group scores in descending order. 

Least Misery - With the least misery aggregation strategy,
the group score for item i to a group G is equal to the
smallest predicted rating for item i in the group,
specifically

$$
S_{\text{misery}}(G, i) = \min_{u \in G} \{\Pr(i \mid u)\}
\tag{15}
$$

Following this strategy, whether an item is acceptable to the
group depends on the least satisfied member. In other words,
the item that is least disliked by the individual members has
the highest group score for recommendation.
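
The two aggregation strategies can be compared on a toy example. The sketch below assumes a hypothetical `score(u, i)` callback that returns the estimate Pr(i|u); the function names and the toy scores are illustrative only.

```python
def average_score(group, item, score):
    """Average strategy: mean of the members' individual scores."""
    return sum(score(u, item) for u in group) / len(group)


def least_misery_score(group, item, score):
    """Least misery strategy: score of the least satisfied member."""
    return min(score(u, item) for u in group)


def rank_items(group, items, score, strategy):
    """Sort items by group score in descending order."""
    return sorted(items, key=lambda i: strategy(group, i, score), reverse=True)


# Toy individual scores (assumed values): "a" loves x, "b" dislikes it.
toy_scores = {("a", "x"): 0.9, ("b", "x"): 0.1,
              ("a", "y"): 0.4, ("b", "y"): 0.4}


def score(u, i):
    return toy_scores[(u, i)]


group = ["a", "b"]
by_average = rank_items(group, ["x", "y"], score, average_score)
by_misery = rank_items(group, ["x", "y"], score, least_misery_score)
```

Here the average strategy prefers x (0.5 against 0.4), while least misery prefers y (0.4 against 0.1), which shows how a single dissatisfied member changes the outcome under the second strategy.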

These two aggregation-based group recommendation approaches
capture a group consensus of item ranking by considering all
the decisions made by users to be independent and equally
important. However, in a group activity, people interact with
each other and thus influence each other. We aim to address
this observation in our social influence based group
recommendation algorithm.

6.2 Social Influence based Recommendation 

Note that we restrict the recommendation to a group where
every group member has at least one friend in the group.
Within such a group, friends may influence each other so there
may exist a group consensus. While our generative models aim
to capture the process where a given user u (influenced by a
friend f) selects an item i, we can also see the process as a
group activity, i.e., u is influenced by f to jointly select
item i. Intuitively, our models can be used directly to
support group recommendation for "two-member" groups.

Let G_2 = {u_1, u_2} denote a "two-member" group. To select
an item for the group, user u_1 could influence user u_2 and
vice versa. Therefore, we define the score for recommending
an item i to the group G_2 as



$$
S_{\text{influence}}(G_2, i) = \Pr(u_1, u_2, i) + \Pr(u_2, u_1, i)
\tag{16}
$$

where

$$
\Pr(u, f, i) = \sum_{z} \Pr(z)\,\Pr(u \mid f)\,\Pr(f \mid z)\,\Pr(i \mid z)
\tag{17}
$$

can be easily obtained from the model parameters of our
generative models.





Figure 4: Decompose an arbitrary group into a set of
two-member groups. Edges between nodes denote online
friendship. (Panels: one four-member group; four two-member
groups.)

The ideas described above can be generalized to groups with
more than two members. A group of more than two members can
be decomposed into a set of two-member groups based on the
friendship of members (see the example in Figure 4 for
illustration). To make a group recommendation, we assume that
social influence takes effect only between friends.
Intuitively, if most pairs of friends in the group prefer a
particular item, it would be a good candidate for
recommendation to the group. Let G denote a group with
arbitrary cardinality. The score for recommending an item i
to G is defined as the sum of the S_influence(G_2, i) scores
over all possible friend pairs in the group. Formally,

$$
S_{\text{influence}}(G, i) = \sum_{\forall (u, f) \in G \times G,\; u \neq f,\; f \in F(u)} S_{\text{influence}}(\{u, f\}, i)
\tag{18}
$$

The ranking of items for group recommendation is based on
the sorted group scores of items as defined above. Thus, the
decision of selecting an item for the group naturally
incorporates the social influence among members of the group.
We find superior performance of our social-influence strategy
over the two aggregation strategies (to be shown in our
evaluation).
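
The pairwise decomposition of Equations (16) to (18) can be sketched as below. The code is an illustration under assumptions: parameters live in a hypothetical `params` dictionary, missing entries are treated as zero, and the group score iterates over ordered pairs exactly as Equation (18) is written, so a symmetric friendship contributes twice (a constant factor that does not change the ranking).

```python
def triple_prob(u, f, i, params):
    """Pr(u, f, i) = sum_z Pr(z) Pr(u|f) Pr(f|z) Pr(i|z), as in Equation (17)."""
    return sum(
        pz
        * params["p_u_f"].get((u, f), 0.0)
        * params["p_f_z"].get((f, z), 0.0)
        * params["p_i_z"].get((i, z), 0.0)
        for z, pz in params["p_z"].items()
    )


def pair_score(u1, u2, i, params):
    """Two-member group score, as in Equation (16)."""
    return triple_prob(u1, u2, i, params) + triple_prob(u2, u1, i, params)


def group_score(group, i, friends, params):
    """Sum of pair scores over ordered friend pairs, as in Equation (18)."""
    total = 0.0
    for u in group:
        for f in group:
            if u != f and f in friends.get(u, ()):
                total += pair_score(u, f, i, params)
    return total


# Toy parameters (assumed values) with a single topic.
params = {
    "p_z": {0: 1.0},
    "p_f_z": {("a", 0): 0.5, ("b", 0): 0.3, ("c", 0): 0.2},
    "p_u_f": {("a", "b"): 0.2, ("b", "a"): 0.4,
              ("b", "c"): 0.1, ("c", "b"): 0.3},
    "p_i_z": {("x", 0): 0.5},
}
friends = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

score_ab = pair_score("a", "b", "x", params)
score_group = group_score(["a", "b", "c"], "x", friends, params)
```

In this toy group, a and c are not friends, so only the pairs (a, b) and (b, c) contribute to the group score.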

7. PERFORMANCE EVALUATION 

In this section, we validate our proposed probabilistic
generative models using two real datasets, one from last.fm
and the other from whrrl.com. We develop web crawlers to
collect these two datasets, which include user-item access
history, users' friendship networks and the tags associated
with each item. In addition, we collect group check-in history
data from whrrl.com to validate our group recommendation
approach. In our evaluation, we adopt the user-based
collaborative filtering approach (denoted as CF) as a baseline
and study the effectiveness of the different factors (i.e.,
social influence, collaborative filtering and item content)
included in our unified generative model. The configurations
of factors included in our evaluation are: 1) CF factor
(CF-PGM) (see Figure 1), 2) combination of CF and social
influence factors (CF+SI-PGM) (see Figure 2), 3) combination
of CF and item content factors (CF+IC-PGM) (this has been
discussed in [28]), and 4) combination of CF, social influence
and item content factors (CF+SI+IC-PGM) (i.e., our unified
model; see Figure 3). In this evaluation, we conduct a
comprehensive set of experiments for item recommendation (to
a single user) and group recommendation.

7.1 Dataset Description 

Here we first provide information about the datasets, i.e.,
last.fm and whrrl.com, used in our experiments. Last.fm is an
online music radio web service and whrrl.com is a
location-based social network web service. The last.fm
dataset contains the music access history of 3,143 users over
23,467 unique songs, while the whrrl.com dataset includes the
check-in history of 7,145 users to 74,217 unique places. It is
worth noting that the whrrl.com dataset includes 17,587 group
check-in records, which are very valuable for evaluating the
group recommendation approaches. Additionally, both datasets
have their user social networks available. The basic
statistics of these two datasets are summarized in Table 1.
Cumulative distributions with respect to the number of items
accessed by users (User Items), the number of friends of users
(User Friends), and the number of tags associated with items
(Item Tags) are shown in Figure 5 and Figure 6 for last.fm and
whrrl.com, respectively.

7.2 Parameter Initialization and Training 

After the datasets are prepared, we are able to apply the EM
algorithms to infer model parameters. However, for all EM
methods, model parameters need to be initialized and the
iteration termination condition needs to be specified. After
experimenting with different model parameter initialization
methods, we decided to use Latent Dirichlet Allocation (LDA)
[6] to initialize the model parameters. Although LDA has been
mainly used for clustering documents, it has parameters
similar to those of the CF-PGM in Figure 1. To obtain initial
parameters from LDA, we treat each user as a "document" in the
LDA model and treat the items accessed by the user as "words"
in the document. After the LDA model converges, we discard the
document clustering but keep Pr(u|z) and Pr(i|z) as the model
initialization values for the EM algorithms. For the social
influence parameters Pr(u|f) required in CF+SI-PGM, we use the
normalized Jaccard similarity as the initial values (i.e.,
Pr(u|f) ∝ JaccardSim(I(u), I(f)), where I(u) and I(f) denote
the accessed items for u and f, respectively). This
initialization ensures that friends with more commonly
accessed items have a higher social influence initially. Note
that in our models, u is treated as a special friend of
himself. Since JaccardSim(I(u), I(u)) = 1, a user's
self-influence is always larger than any social influence from
his friends at the beginning. To terminate the EM algorithms,
we use the log-likelihood as the convergence indicator and
stop when an additional EM iteration improves the training
data's log-likelihood by less than 0.0001 or when the maximum
iteration threshold (empirically set to 50) is reached.

Figure 5: Cumulative Distributions (last.fm)

We implement both the single-machine EM algorithm and its
Map-Reduce version and confirm that both implementations
produce the same results on small datasets. For the more
complicated models (i.e., CF+SI-PGM and CF+SI+IC-PGM), we
apply our Map-Reduce implementation on a cluster of 10
machines to infer the model parameters.
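
The Jaccard-based initialization and the stopping rule described in this section can be sketched as follows. The function names are hypothetical, the weights are returned unnormalized because the exact normalization is not spelled out here, and the empty-set convention in `jaccard_sim` is an assumption of this sketch.

```python
def jaccard_sim(a, b):
    """Jaccard similarity of two item collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def initial_influence_weights(u, friends, items_of):
    """Unnormalized starting weights for Pr(u|f), for f in F(u) and u itself."""
    candidates = set(friends.get(u, ())) | {u}  # u is a special friend of u
    return {f: jaccard_sim(items_of[u], items_of[f]) for f in candidates}


def should_stop(prev_ll, curr_ll, iteration, tol=1e-4, max_iter=50):
    """Stop when the log-likelihood gain is below tol or max_iter is reached."""
    return (curr_ll - prev_ll) < tol or iteration >= max_iter


# Toy access histories (assumed values).
items_of = {"u": {"a", "b", "c"}, "f": {"b", "c", "d"}}
weights = initial_influence_weights("u", {"u": ["f"]}, items_of)
```

With these toy histories the friend starts with weight 0.5 and the user with weight 1.0, which matches the remark that self-influence is initially the largest.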

7.3 Item Recommendation 

We use item recommendation as the primary test case to
evaluate the performance of the probabilistic generative
models under evaluation. We apply a cross-validation method to
measure the precision and recall of item recommendation. For
both datasets, we hold out 30% of the item access history of
each user for testing. In other words, the remaining 70% of
the user-item pairs are used as training data to infer model
parameters. After each model is learned, we use the model
parameters to find Pr(i|u) for all items i and all users u.
For each user, the items that are not present in that user's
training data are ranked based on Pr(i|u). In this way, we
prevent our recommendation system from "repeating" a user's
item access history. Therefore, all the recommendations for a
user must be "fresh" items that have not been accessed by him
in the training dataset. The precision and recall for the top
n recommendations are used as the evaluation metrics, where
n = 5, 10, 20, 50 (5 is the default value). Precision is
calculated as the ratio of the number of recommendation hits
to the recommendation size, and recall is calculated as the
ratio of the number of recommendation hits to the size of the
user's validation item set. The precision and recall averaged
over all users then serve as the evaluation metrics.
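
The two metrics defined above can be computed as in the following sketch. The function names and the toy rankings are hypothetical; the recommendation size is taken to be n, and a user with an empty held-out set is given recall 0.0 by assumption.

```python
def precision_recall_at_n(ranked_items, held_out, n):
    """Precision and recall of the top-n recommendations for one user."""
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in held_out)
    precision = hits / n                      # hits / recommendation size
    recall = hits / len(held_out) if held_out else 0.0
    return precision, recall


def average_metrics(per_user):
    """Average the (precision, recall) pairs over all users."""
    count = len(per_user)
    mean_precision = sum(p for p, _ in per_user) / count
    mean_recall = sum(r for _, r in per_user) / count
    return mean_precision, mean_recall


# Toy rankings and held-out sets (assumed values).
user1 = precision_recall_at_n(["a", "b", "c", "d", "e"], {"b", "e", "z", "q"}, 5)
user2 = precision_recall_at_n(["x", "y", "z", "w", "v"], {"x"}, 5)
overall = average_metrics([user1, user2])
```

For the first toy user, two of the five recommended items are in the held-out set of four, giving a precision of 0.4 and a recall of 0.5.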
Figure 7: Test Topic Sizes (last.fm)





Metric     Method         Top 5         Top 10   Top 20   Top 50
Precision  CF-PGM         0.03??        0.0372   0.0342   [illegible]
           CF+SI-PGM      0.0427        0.0429   0.0383   0.0315
           CF+IC-PGM      0.0470        0.0542   0.04?4   [illegible]
           CF+SI+IC-PGM   0.0492        0.0566   0.0506   0.0398
           CF             0.0046        0.0066   0.0080   0.0085
Recall     CF-PGM         0.0048        0.0085   0.0157   0.0329
           CF+SI-PGM      0.0050        0.0100   0.0187   0.0363
           CF+IC-PGM      0.0054        0.0102   0.0198   0.0385
           CF+SI+IC-PGM   0.0057        0.0117   0.0213   0.0411
           CF             0.0010        0.0024   0.0050   0.0122

Table 2: Performance on last.fm dataset




Metric     Method         Top 5         Top 10   Top 20   Top 50
Precision  CF-PGM         0.0048        0.0038   0.0035   0.0028
           CF+SI-PGM      0.005?        0.0041   0.0036   0.0028
           CF+IC-PGM      0.0059        0.0048   0.0040   0.0029
           CF+SI+IC-PGM   [illegible]   0.0049   0.0041   0.0030
           CF             0.0015        0.0016   0.0015   0.0011
Recall     CF-PGM         0.0062        0.0090   0.0141   0.0251
           CF+SI-PGM      0.006?        0.0100   0.0146   0.0254
           CF+IC-PGM      0.0076        0.0119   0.0157   0.0252
           CF+SI+IC-PGM   0.0081        0.0115   0.0154   0.0275
           CF             0.0020        0.0033   0.0051   0.0071

Table 3: Performance on whrrl.com dataset




Figure 8: Test Topic Sizes (whrrl.com). Panels: (a) Precision,
(b) Recall; x-axis: # of Topics; series: CF-PGM, CF+SI-PGM,
CF+IC-PGM, CF+SI+IC-PGM, CF.

In Figure 7 and Figure 8, the precision and recall of the top
5 item recommendations for last.fm and whrrl.com under
different latent topic sizes are presented. We find that
social influence indeed improves the recommendation
performance, both for CF+SI-PGM against CF-PGM and for
CF+SI+IC-PGM against CF+IC-PGM. The results show that the
best recommendation performance is reached when the topic size
is around 60. Therefore, we set the default value of the
latent topic size to 60 for the remaining part of the
performance evaluation.

In Table 2 and Table 3, we compare the item recommendation
performance of the different algorithms. As shown in these
two tables, all the probabilistic generative model approaches
clearly outperform the conventional user-based collaborative
filtering (CF). Again, we find that the social influence
factor indeed improves the recommendation performance (both
for CF+SI-PGM against CF-PGM and for CF+SI+IC-PGM against
CF+IC-PGM). Most importantly, the unified model we propose in
this paper (which integrates collaborative filtering, social
influence and item content) shows the best performance.

By comparing the results from the whrrl.com and last.fm
datasets, we find that social influence is more important (in
terms of item recommendation) in whrrl.com than in last.fm.
One possible reason is that in our collected datasets, users
in whrrl.com are more social than users in last.fm, i.e., the
average number of friends in whrrl.com is 9.08 compared to
1.91 in last.fm. We also observe this phenomenon from the
statistics shown in Figure 5(b) and Figure 6(b). In other
words, users in whrrl.com are more likely to be influenced by
their online friends than users in last.fm. Consequently, the
recommendation performance benefit from social influence in
last.fm is less significant than that in whrrl.com.

7.4 Social Influence Study 

In this section, we aim to study the social influence between
friends, where the social influence is learned through our
proposed models. For simplicity, we focus on CF+SI-PGM.
Instead of investigating how social influence improves the
recommendation performance, here we are interested in how
significantly a particular user influences his friends. As
different people have different personalities, we plot the
distributions of social influence probabilities among friend
pairs (learned through CF+SI-PGM) in Figure 9 and Figure 10.
Note that we also consider the circumstance of self-influence
and use Pr(u|u) to denote the probability.
Figure 9: Social Influence Result (last.fm)

Figure 9 shows the learned social influence present in
last.fm. In general, people's self-influence (Pr(u|u)) in this
dataset is significantly higher than the influence from their
friends (Pr(u|f), f ≠ u) when choosing a music piece.
Figure 9(a) shows that more than 90% of users' self-influences
are higher than 0.95. Also, since each user may have several
friends, each friend's social influence is quite small, e.g.,
90% of friends' influences are smaller than 0.01. This
observation indicates that most music pieces consumed by the
users in last.fm are selected in accordance with the users'
own preferences and tastes.

Figure 10: Social Influence Result (whrrl.com)

Figure 10 demonstrates very different findings. As shown in
Figure 10(a), the self-influence is still quite significant
but much smaller than that in last.fm, i.e., 10% of users'
self-influences are lower than 0.8. The implication of this
finding is that while people visit places mainly based on
their own preferences, they sometimes take friends'
suggestions to visit places. While users in whrrl.com have
more friends than users in last.fm (i.e., 9.08 friends on
average compared to 1.91), we find that many friends are not
influential and that usually a small portion of friends
accounts for most of the social influence. In general,
people's social influence in place check-in activities is much
more significant than in music consumption. One explanation is
that check-ins are inherently social activities, whereas music
consumption is usually for self-entertainment.

7.5 Group Recommendation 

Finally, we report our findings on the evaluation of group
recommendation algorithms, including the social-influence
based (SIG) algorithm introduced in Section 6 along with the
two aggregation-based group recommendation strategies. To
compare their performance, we use the 17,587 group check-in
records in whrrl.com. In our experiment, we consider one
group check-in record (i.e., the ground truth) at a time and
take the average over the tested records. Notice that a record
indicates that a group of people visits a place. An effective
group recommendation algorithm should rank this place high
among all the places returned. Therefore, we propose a metric
called relative ranking to evaluate the performance of these
group recommendation algorithms. Suppose that a given group
recommendation algorithm returns a ranked list of m items
(i.e., places in this experiment). If the actual visited place
is ranked in the l-th position of the returned list, the
relative ranking is calculated as l/m. For example, if an
actual visited place is ranked 10th among a total of 100 items
returned by a group recommendation algorithm, the relative
ranking is 10/100 = 0.1.
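
The relative ranking metric can be written down directly. The sketch below uses hypothetical function names and a made-up list of 100 places; it assumes the visited place always appears in the returned list.

```python
def relative_ranking(ranked_items, visited):
    """Position of the visited item (1-based) divided by the list length."""
    position = ranked_items.index(visited) + 1
    return position / len(ranked_items)


def mean_relative_ranking(records):
    """Average relative ranking over (ranked_items, visited) test records."""
    values = [relative_ranking(ranked, visited) for ranked, visited in records]
    return sum(values) / len(values)


# Toy example (assumed data): 100 places, the visited one ranked 10th.
places = [f"place{k}" for k in range(1, 101)]
single = relative_ranking(places, "place10")
mean = mean_relative_ranking([(places, "place10"), (places, "place30")])
```

Lower values are better: a method that always ranks the visited place first would score 1/m on every record.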

In Figure 11, we compare the performance of our SIG group
recommendation method with the other two aggregation-based
strategies, i.e., Average and Least Misery. The values on the
Y-axis represent the relative rankings of actual visited
places (the lower the better). In Figure 11(a), we find that
SIG outperforms the Average and Least Misery strategies for
most of the group sizes tested. However, the larger the group
is, the smaller the improvement achieved by SIG. This finding
implies that for smaller groups, the social influence among
group members plays a major role in item selection for the
group. However, for larger groups, the group consensus
aggregated from individual preferences may dominate the group
decision. This finding is consistent with our common
experience that in activity planning for a smaller group, one
or two influential members may significantly determine the
activity venue. On the other hand, for a large group, the
social influence from individuals may hardly take effect on
the entire group. As a result, the group's common interest
dominates. Next, we evaluate the three group recommendation
strategies by varying the topic size. The result shown in
Figure 11(b) indicates that SIG always outperforms the other
two and reaches its optimal point when the topic size is
configured to around 60.

8. CONCLUSION 

This research attempts to explore social influence for item 
recommendation. We propose a probabilistic generative model, 
called unified model, which naturally unifies the ideas of 
social influences, collaborative filtering and content-based 
methods in the recommendation process. To address the 
issue of hidden social influence in the available datasets, 
we devise new algorithms to learn the model parameters
based on the idea of expectation maximization (EM).
Moreover, we provide a Map-Reduce implementation, in ad- 
dition to a single-machine version, of our EM algorithm to 
process large-scale datasets. Furthermore, by exploring the 
social influence quantitatively captured in our models, we 
develop a social influence based group recommendation al- 
gorithm to demonstrate the strength of our proposed models 
on group recommendation. Finally, we conduct a compre- 
hensive experimental study to evaluate the performance of 
our proposal for item recommendation to individual users 
and to groups. Experimental results show that the unified 
probabilistic generative model proposed in this paper ac- 
commodates different factors very well to achieve a superior 
recommendation performance over other alternatives. Our 
experimental results also facilitate a better understanding of 
the social influence between friends in social networks. It is 
interesting to note that users in whrrl.com are more likely to 
be influenced by friends than those in last.fm. Finally, our 
experimental result also confirms that our social influence 
based algorithm outperforms the state-of-the-art algorithms 
for group recommendation. 
APPENDIX

A. EM ALGORITHM DERIVATION 

The EM algorithm is a way to find model parameters that
achieve a local maximum of the log-likelihood function L(θ).
Since directly maximizing L(θ) is difficult, the EM algorithm
applies an iterative method to improve the model parameters
step by step. Starting from the log-likelihood L(θ), we have:

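$$
\begin{aligned}
\mathcal{L}(\theta) &= \log \prod_{(u,i) \in H} \Pr(u, i \mid \theta) = \sum_{(u,i) \in H} \log \Pr(u, i \mid \theta) \\
&= \sum_{(u,i) \in H} \log \sum_{z, f} \Pr(u, i, z, f \mid \theta) \\
&= \sum_{(u,i) \in H} \log \sum_{z, f} \Pr(z, f \mid u, i, \theta_x)\, \frac{\Pr(u, i, z, f \mid \theta)}{\Pr(z, f \mid u, i, \theta_x)} \\
&\geq \sum_{(u,i) \in H} \sum_{z, f} \Pr(z, f \mid u, i, \theta_x) \log \frac{\Pr(u, i, z, f \mid \theta)}{\Pr(z, f \mid u, i, \theta_x)} \\
&\triangleq Q(\theta \mid \theta_x)
\end{aligned}
\tag{19}
$$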



Therefore, instead of maximizing L(θ), the EM algorithm
tries to find model parameters θ_{x+1} that maximize
Q(θ|θ_x). Dropping the terms that are constant w.r.t. θ, we
obtain

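$$
\begin{aligned}
\theta_{x+1} &= \arg\max_{\theta} \{ Q(\theta \mid \theta_x) \} \\
&= \arg\max_{\theta} \Big\{ \sum_{(u,i) \in H} \sum_{z, f} \Pr(z, f \mid u, i, \theta_x) \log \Pr(u, i, z, f \mid \theta) \Big\} \\
&= \arg\max_{\theta} \Big\{ \sum_{(u,i) \in H} E_{z, f \mid u, i, \theta_x} \{ \log \Pr(u, i, z, f \mid \theta) \} \Big\}
\end{aligned}
\tag{20}
$$

Therefore, the EM algorithm consists of iterating: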



1. E-step: Determine the conditional expectation in
Equation (20).

2. M-step: Maximize this expectation with respect to θ.

The E-step needs to find the posterior probabilities in
Equation (20), i.e., to compute Pr(z, f|u, i, θ_x). Because
these probabilities assume that the model parameters are known
as θ_x, we have:



$$
\Pr(z, f \mid u, i, \theta_x) =
\frac{\Pr(z)\,\Pr(f \mid z)\,\Pr(u \mid f)\,\Pr(i \mid z)}
{\sum_{z' \in Z} \sum_{f' \in F(u)} \Pr(z')\,\Pr(f' \mid z')\,\Pr(u \mid f')\,\Pr(i \mid z')}
\tag{21}
$$



where the right-hand side of Equation (21) only consists of
the parameters in θ_x.

In the M-step, we need to find the model parameters that
maximize Equation (20). Firstly, we can break up the term
log Pr(u, i, z, f|θ) according to the factorization of the
joint distribution as:



$$
\log \Pr(u, f, z, i \mid \theta) = \log \Pr(z) + \log \Pr(u \mid f) + \log \Pr(f \mid z) + \log \Pr(i \mid z)
\tag{22}
$$

Plugging Equation (22) into the expectation in Equation (20)
and following standard calculations, we have:
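$$
\begin{aligned}
\theta_{x+1} = \arg\max_{\theta} \Big\{
& \sum_{z} \log \Pr(z) \cdot \Big( \sum_{(u', i') \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i') \Big) \\
+ & \sum_{u, f} \log \Pr(u \mid f) \cdot \Big( \sum_{(u, i') \in H \,\wedge\, f \in F(u)} \; \sum_{z' \in Z} \Pr(z', f \mid u, i') \Big) \\
+ & \sum_{f, z} \log \Pr(f \mid z) \cdot \Big( \sum_{(u', i') \in H \,\wedge\, f \in F(u')} \Pr(z, f \mid u', i') \Big) \\
+ & \sum_{i, z} \log \Pr(i \mid z) \cdot \Big( \sum_{(u', i) \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i) \Big)
\Big\}
\end{aligned}
\tag{23}
$$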



In Equation (23), the model parameters are separated into
different inner products. For example, the term related to
Pr(z) is the inner product of log Pr(z), ∀z ∈ Z, with the
corresponding posterior sums. Recall that we always have the
probability constraint that Σ_z Pr(z) = 1. To maximize the
inner product, Pr^+(z) should be chosen so that the Pr(z)
vector is in the same "direction" as the summed posteriors. In
other words, Pr^+(z) should be proportional to the
corresponding summed Pr(z, f|u, i) (see footnote 5). Applying
the same maximization to all the model parameters, we can find
θ_{x+1} = {Pr^+(z), Pr^+(u|f), Pr^+(f|z), Pr^+(i|z)} as:

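$$
\Pr^{+}(z) \propto \sum_{(u', i') \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i')
\tag{24a}
$$

$$
\Pr^{+}(u \mid f) \propto \sum_{(u, i') \in H \,\wedge\, f \in F(u)} \; \sum_{z' \in Z} \Pr(z', f \mid u, i')
\tag{24b}
$$

$$
\Pr^{+}(f \mid z) \propto \sum_{(u', i') \in H \,\wedge\, f \in F(u')} \Pr(z, f \mid u', i')
\tag{24c}
$$

$$
\Pr^{+}(i \mid z) \propto \sum_{(u', i) \in H} \; \sum_{f' \in F(u')} \Pr(z, f' \mid u', i)
\tag{24d}
$$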



5 This vector maximization method is still an approximation,
but this approximation is usually good enough and is also
adopted in [14].
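
One full EM iteration for the model without item content, combining the E-step of Equation (21) with the proportional updates of Equations (24a) to (24d), can be sketched as follows. This is a single-machine illustration, not the authors' implementation: the function names and the dictionary layout are hypothetical, and normalizing each conditional over its left-hand variable (for example Pr(i|z) over i for each z) is an assumption of this sketch.

```python
from collections import defaultdict


def e_step(u, i, friends, p_z, p_f_z, p_u_f, p_i_z):
    """Return {(z, f): Pr(z, f | u, i)} under the current parameters."""
    weights = {}
    for z, pz in p_z.items():
        for f in friends[u]:
            weights[(z, f)] = (pz * p_f_z.get((f, z), 0.0)
                               * p_u_f.get((u, f), 0.0)
                               * p_i_z.get((i, z), 0.0))
    total = sum(weights.values())
    if total == 0.0:
        return {}
    return {key: value / total for key, value in weights.items()}


def normalize(sums, group_of):
    """Rescale posterior sums so that each group of keys adds up to one."""
    totals = defaultdict(float)
    for key, value in sums.items():
        totals[group_of(key)] += value
    return {key: value / totals[group_of(key)] for key, value in sums.items()}


def em_iteration(history, friends, p_z, p_f_z, p_u_f, p_i_z):
    s_z, s_f_z, s_u_f, s_i_z = (defaultdict(float) for _ in range(4))
    for u, i in history:
        posterior = e_step(u, i, friends, p_z, p_f_z, p_u_f, p_i_z)
        for (z, f), p in posterior.items():
            s_z[z] += p            # posterior sum for Pr+(z)
            s_f_z[(f, z)] += p     # posterior sum for Pr+(f|z)
            s_u_f[(u, f)] += p     # posterior sum for Pr+(u|f)
            s_i_z[(i, z)] += p     # posterior sum for Pr+(i|z)
    return (normalize(s_z, lambda key: None),
            normalize(s_f_z, lambda key: key[1]),   # over f for each z
            normalize(s_u_f, lambda key: key[1]),   # over u for each f
            normalize(s_i_z, lambda key: key[1]))   # over i for each z


# Toy data and starting parameters (assumed values).
history = [("a", "x"), ("a", "y"), ("b", "x")]
friends = {"a": ["a", "b"], "b": ["b", "a"]}
p_z = {0: 0.5, 1: 0.5}
p_f_z = {(f, z): 0.5 for f in ("a", "b") for z in (0, 1)}
p_u_f = {("a", "a"): 0.7, ("b", "a"): 0.3, ("b", "b"): 0.7, ("a", "b"): 0.3}
p_i_z = {("x", 0): 0.6, ("y", 0): 0.4, ("x", 1): 0.3, ("y", 1): 0.7}

new_p_z, new_p_f_z, new_p_u_f, new_p_i_z = em_iteration(
    history, friends, p_z, p_f_z, p_u_f, p_i_z)
```

In a Map-Reduce setting, the accumulation loop inside `em_iteration` is the part that would be split across mappers and reducers, while the normalization is a cheap final pass.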



